Text Classification by Aggregation of SVD Eigenvectors

Authors

  • Panagiotis Symeonidis
  • Ivaylo Kehayov
  • Yannis Manolopoulos
Abstract

Text classification is the process of categorizing documents, usually by topic, place, reading ease, etc. For classification by topic, a well-known method is Singular Value Decomposition (SVD). For classification by readability, the Flesch Reading Ease index calculates the readability level of a document (e.g., easy, medium, advanced). In this paper, we propose combining SVD with either cosine similarity or Aggregated Similarity Matrices to categorize documents by readability and by topic. We experimentally compare both methods with the Flesch Reading Ease index and the vector-based cosine similarity method on a synthetic and a real data set (Reuters-21578). Both proposed methods clearly outperform the other methods in the comparison.
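As an illustration of the SVD-plus-cosine-similarity idea named in the abstract, the following is a minimal sketch and not the authors' implementation. The toy term-by-document matrix, the labels, the function names (`fit_svd`, `cosine`, `classify`), the choice of `k = 2`, and the nearest-neighbour decision rule are all assumptions made for the example; the Aggregated Similarity Matrices variant is not covered because the abstract does not describe it.

```python
import numpy as np

def fit_svd(term_doc, k):
    """Truncated SVD of a term-by-document matrix (rows: terms, columns: documents)."""
    U, s, Vt = np.linalg.svd(term_doc, full_matrices=False)
    U_k = U[:, :k]                    # top-k left singular vectors (term space)
    doc_vecs = Vt[:k, :].T * s[:k]    # each row: one training document in k dimensions
    return U_k, doc_vecs

def cosine(a, b):
    """Cosine similarity; returns 0.0 when either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def classify(query, U_k, doc_vecs, labels):
    """Project a term-count vector into the latent space and return the label
    of the most similar training document (nearest neighbour, an assumption)."""
    q = query @ U_k
    sims = [cosine(q, d) for d in doc_vecs]
    return labels[int(np.argmax(sims))]

# Hypothetical toy data: six terms, four labelled training documents.
terms = ["stock", "market", "bank", "match", "goal", "team"]
term_doc = np.array([
    [2, 1, 0, 0],
    [1, 2, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 3, 1],
    [0, 0, 1, 2],
    [0, 0, 1, 1],
], dtype=float)
labels = ["finance", "finance", "sport", "sport"]

U_k, doc_vecs = fit_svd(term_doc, k=2)

finance_query = np.array([1, 1, 0, 0, 0, 0], dtype=float)  # "stock market"
sport_query = np.array([0, 0, 0, 1, 1, 0], dtype=float)    # "match goal"
print(classify(finance_query, U_k, doc_vecs, labels))
print(classify(sport_query, U_k, doc_vecs, labels))
```

Projecting a new document with `query @ U_k` places it in the same coordinates as the rows of `doc_vecs`, so the cosine comparison is made entirely in the reduced space rather than on raw term counts.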

Similar resources

Feature Extraction of Visual Evoked Potentials Using Wavelet Transform and Singular Value Decomposition

Introduction: Brain visual evoked potential (VEP) signals are known to be accompanied by high levels of background noise, typically from the spontaneous background brain activity in electroencephalography (EEG) signals. Material and Methods: A model based on a dyadic filter bank, the discrete wavelet transform (DWT), and singular value decomposition (SVD) was developed to analyze the raw data...

A New Coding System for Monochromatic Images Based on Wavelet Transform and Singular Value Decomposition (HDWTSVD)

This paper proposes the HDWTSVD algorithm for encoding monochromatic images. The algorithm combines DWT and SVD techniques. The input image is divided into tiles of 64x64 pixels. A criterion based on the average standard deviation of 8x8 subblocks is used to choose between DWT and SVD: if a tile exhibits a high average standard deviation, it is compressed using SVD; otherwise, it is compressed using DWT. Eigenvalues ...

Improved automatic target recognition using singular value decomposition

A new algorithm is presented for Automatic Target Recognition (ATR) in which the templates are obtained via Singular Value Decomposition (SVD) of High Range Resolution (HRR) profiles. SVD analysis of a large class of HRR data reveals that the Range-space eigenvectors corresponding to the largest singular value account for more than 90% of target energy. Hence, it is proposed that the Range-space ...

Conceptually Co-occurring Words Included as Feature Selection in Text Document Classification using SVD and SVM

Document classification is a means of knowledge extraction in the text mining process, and it has been studied by many researchers. We include one additional feature in the preprocessing phase and evaluate its effect with a Support Vector Machine (SVM). Feature selection is carried out with the vector space method and Singular Value Decomposition (SVD), which is specifically...

Face Recognition Using Matrix Decomposition Technique Eigenvectors and SVD

Principal Component Analysis (PCA) is an important and well-known technique for face recognition in which eigenvectors are used. In this paper, we propose a face recognition technique that combines eigenvectors with Singular Value Decomposition (SVD) to reduce the size of the eigen-matrix. The detailed theoretical derivation and analysis are presented, along with simulation results on Olivetti ...

Publication date: 2012